Learning causal networks with latent variables from multivariate information in genomic data
Abstract
Learning causal networks from large-scale genomic data remains challenging in the absence of time series or controlled perturbation experiments. We report an information-theoretic method that learns a large class of causal or non-causal graphical models from purely observational data, while including the effects of unobserved latent variables, which are common in genomic datasets. Starting from a complete graph, the method iteratively removes dispensable edges by uncovering significant information contributions from indirect paths, and assesses edge-specific confidences from randomization of the available data. The remaining edges are then oriented based on the signature of causality in observational data. The approach and associated algorithm, miic, outperform earlier methods on a broad range of benchmark networks. Causal network reconstructions are presented at different biological size and time scales, from gene regulation in single cells to whole-genome duplication in tumor development, as well as the long-term evolution of vertebrates. Miic is publicly available at https://github.com/miicTeam/MIIC.
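The edge-removal step described in the abstract can be illustrated with a deliberately simplified sketch. This is not the miic implementation: the function names, the fixed threshold, the restriction to a single conditioning variable, and the simulated chain data are all assumptions made for illustration, and the confidence assessment and edge orientation steps are omitted. It only shows the general idea of starting from a complete graph and dropping an edge when an indirect path accounts for the information shared by its two endpoints.

```python
import math
import random
from collections import Counter
from itertools import combinations


def entropy(columns):
    """Plug-in Shannon entropy (in nats) of the joint distribution of the columns."""
    rows = list(zip(*columns))
    n = len(rows)
    return -sum(c / n * math.log(c / n) for c in Counter(rows).values())


def cond_mutual_info(x, y, zs=()):
    """Plug-in estimate of I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z)."""
    zs = list(zs)
    h_z = entropy(zs) if zs else 0.0
    return entropy([x] + zs) + entropy([y] + zs) - entropy([x, y] + zs) - h_z


def prune_skeleton(data, threshold=0.02):
    """Start from a complete graph and drop edge A-B when the (conditional)
    mutual information falls below an assumed fixed threshold, conditioning
    on nothing or on one other variable."""
    names = list(data)
    edges = {frozenset(pair) for pair in combinations(names, 2)}
    for a, b in combinations(names, 2):
        conditioning_sets = [()] + [(z,) for z in names if z not in (a, b)]
        for cond in conditioning_sets:
            info = cond_mutual_info(data[a], data[b], [data[z] for z in cond])
            if info < threshold:
                edges.discard(frozenset((a, b)))
                break
    return edges


def simulate_chain(n=5000, seed=0, noise=0.1):
    """Hypothetical binary data from a chain X -> Z -> Y with bit-flip noise."""
    rng = random.Random(seed)
    xs = [rng.randint(0, 1) for _ in range(n)]
    zs = [x ^ (rng.random() < noise) for x in xs]
    ys = [z ^ (rng.random() < noise) for z in zs]
    return {"X": xs, "Z": zs, "Y": ys}


edges = prune_skeleton(simulate_chain())
print(sorted(sorted(e) for e in edges))
```

On the simulated chain, X and Y are dependent marginally but nearly independent once Z is known, so the direct X-Y edge is the one that gets removed. A fixed threshold is the crudest possible criterion and is used here only to keep the example short.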
Similar resources
Experimental Learning of Causal Models with Latent Variables
This article discusses graphical models that can handle latent variables without explicitly modeling them quantitatively. There exist several paradigms for such problem domains. Two of them are semi-Markovian causal models and maximal ancestral graphs. Applying these techniques to a problem domain consists of several steps, typically: structure learning from observational and experimental data,...
Using multivariate generalized linear latent variable models to measure the difference in event count for stranded marine animals
BACKGROUND AND OBJECTIVES: Because marine animals are classified as protected species, data and information on them are very important. This creates a need to retrieve and understand data on event counts for stranded marine animals based on location of emergence, number of individuals, behavior, and threats to their presence. Whales are g...
Causal Graphical Models with Latent Variables: Learning and Inference
Several paradigms exist for modeling causal graphical models for discrete variables that can handle latent variables without explicitly modeling them quantitatively. Applying them to a problem domain consists of different steps: structure learning, parameter learning and using them for probabilistic or causal inference. We discuss two well-known formalisms, namely semi-Markovian causal models a...
Discovery of Causal Models that Contain Latent Variables Through Bayesian Scoring of Independence Constraints
Discovering causal structure from observational data in the presence of latent variables remains an active research area. Constraint-based causal discovery algorithms are relatively efficient at discovering such causal models from data using independence tests. Typically, however, they derive and output only one such model. In contrast, Bayesian methods can generate and probabilistically score ...
Automatic Discovery of Latent Variable Models
Much of our understanding of Nature comes from theories about unobservable entities. Identifying which hidden variables exist given measurements in the observable world is therefore an important step in the process of discovery. Such an enterprise is only possible if the existence of latent factors constrains how the observable world can behave. We do not speak of atoms, genes and antibodies be...